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Abstract 

Background: A classical problem in metabolic design is to maximize the production of desired compound in a 
given chemical reaction network by appropriately directing the mass flow through the network. Computationally, 
this problem is addressed as a linear optimization problem over the "flux cone". The prior construction of the 
flux cone is computationally expensive and no polynomial-time algorithms are known. 

Results: Here we show that the output maximization problem in chemical reaction networks is NP-complete. 
This statement remains true even if all reactions are monomolecular or bimolecular and if only a single molecular 
species is used as influx. As a corollary we show, furthermore, that the detection of autocatalytic species, i.e., 
types that can only be produced from the influx material when they are present in the initial reaction mixture, is 
an NP-complete computational problem. 

Conclusions: Hardness results on combinatorial problems and optimization problems are important to guide the 
development of computational tools for the analysis of metabolic networks in particular and chemical reaction 
networks in general. Our results indicate that efficient heuristics and approximate algorithms need to be employed 
for the analysis of large chemical networks since even conceptually simple flow problems are provably intractable. 



Background 

Networks of chemical reactions lie at the heart of 
"systems approaches" in chemistry and biology. Af- 
ter all, metabolic networks are merely collections of 
chemical reactions entrenched by enzymes that fa- 
vor some possible reactions over physiologically un- 



desirable side reactions. A detailed understanding of 
their aggregate properties thus is a prerequisite to 
efficiently manipulating them in technical applica- 
tions such as metabolic engineering and at the same 
time form the basis for deeper explorations into their 
evolution. Due to the size of reaction networks of 



Figure 1: Flow optimization in the pentose-phosphate reaction network. Only a small part of the chemical 
space is shown. We allow influx of water H2O and ribulose-5-phosphate to generate glucose-6-phosphate as 
output. Phosphate is produced as waste product. An optimal solution is shown in black, using 6 ribulose- 
5-phosphate molecules to produce 5 glucose-6-phosphate molecules. The values of the flow /( . ) is indicated 
for each hyperedge (black square), e.g., /(a) = 1, f{h) = 1, /(c) = 2, f{d) = 2, /(e) = 2. At each node 
(except the unlabelled input and output nodes) the influx and outflux is balanced. For example, at node x 
(glycerol-3-phosphate) , we have f{d) + /(e) = 4 = /(a) + f{b) + /(c). 



practical interest, efficient algorithms are required 
for their investigation. 

Chemical reaction networks cannot be modeled 
appropriately as graphs despite the many attempts 
in this direction Instead, they are canoni- 

cally specified by their stoichiometric matrix S, aug- 
mented by information on catalysts. Equivalently, 
a collection of chemical reactions on a given set of 
compounds forms a directed (multi)-hypergraph 2 . 
As a consequence, most of computational problems 
associated with chemical reaction networks cannot 
be reformulated as well-studied graph problems and 
hence require the development of a dedicated theory 
and corresponding algorithmic approaches. Math- 
ematical structures similar to the directed hyper- 
graphs arising in chemistry were also explored in a 
theoretical economics setting [sjji]. 

Two complementary approaches to analyzing 
chemical reaction networks have been developed 



mostly in the context of analyzing and manipulat- 
ing metabolisms. Flux Balance Analysis (FBA) is 
concerned with the distribution of steady-state re- 
action fluxes that optimize a biological objective 
function such as biomass or ATP production [5]. 
The objective of metabolic design is to manipulate 
fluxes through a metabolic networks so as to maxi- 
mize the production of a (commercially important) 
substance [g]. More details on the structure of a 
(metabolic) reaction network, on the other hand, is 
obtained my means of elementary mode analysis [t] . 
Both approaches are concerned with stationary mass 
flows through the network, mathematically given as 
solution of SiT, subject to the condition that flux 
Vi through every reaction is non- negative. The ele- 
mentary flux modes (EFMs) are the extremal rays 
of this convex cone (t and can be interpreted as a 
formalization of the concept of a "biochemical path- 
way" [8[|9]. FBA adds a (typically linear) objective 
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function to be optimized over (t. A major drawback 
of EFM-based approaches is the combinatorial ex- 
plosion of EFMs in large networks |10 and the fact 
that the knowledge of EFMs does not directly eluci- 
date the metabolic capabilities of the given network. 
An interesting recent approach thus combines FBA 
with the computation of a subset of EFMs using a 
greedy-like procedure 11 . 

Over the last years, there has been increasing 
interest in the computational complexity of ques- 
tions related to EFMs. For example, an elementary 
flux mode can be found and counted in polynomial 
time [12]. In contrast, the question whether there 
is a "futile cycle", i.e., an EFM without input or 
output (equivalently, a sub-hypergraph in which in- 
degree and out-degree balance for all vertices [2]), 
is NP-complete 13 . Similarly, finding EMFs that 
contain two prescribed reactions is NP-hard |14 . A 
collection of reactions is a reaction cut set for a given 
reaction if, after removing the cut set, the network 
contains no longer an EFM containing the target 
reaction 15, 16 . The problem of finding minimum 



cardinality reaction cut sets is also NP-complete 12 . 
The complexity of enumerating all EFMs is still un- 



known 14 . In 17 , the problem of finding a short- 



est metabolic pathway connecting a set of source 
metabolites with a desired product is shown to be 
NP-hard even if stoichiometric coefficients are ne- 
glected. 

An alternative approach to analyzing the struc- 
ture of chemical reaction networks is to decompose 
them into a hierarchy of algebraically closed and self- 
maintaining sub- networks, called chemical organiza- 
As shown in 
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19 , it is also an NP-hard 



tions 

problem to determine whether there is a a given re- 
action network contains a non-trivial organization. 

In this contribution we focus on a class of compu- 
tational problems in chemical network analysis that 
involve questions relating to both pathways and or- 
ganizational aspects. The problem of of maximizing 
production of a desired collection of output species 
(rather minimizing cardinality of reaction sets) is 
central to metabolic engineering 22 , see Figure [l] 
for an example. In contrast to ffow problems on sim- 
ple graphs [23,, we show here that hypergraph ver- 
sions describing ffuxes in chemical reaction networks 
are computationally hard. As a computational prob- 
lem, this ffow maximization problem is closely re- 
lated to the issue of ffnding autocatalytic intermedi- 
ates in a reaction network. The latter problem has 
received considerable attention in recent years since 



such "metabolic replicators" are universally found in 
present-day metabolic networks and and likely repre- 
sent their ancient ancestral cores 24 . We show here 
that detection of autocatalysts is NP-hard in its gen- 
eral version, although a related problem in the set- 
ting of replicator-like networks admits a polynomial- 
time solution 
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Result: NP-hardness 
Definitions 

In the following paragraphs we formally introduce 
chemical reaction networks. We emphasize that our 
setup is the same as in the literature on ffux analy- 
sis; we have opted, however, for a somewhat different 
notation that is closer to the conventions commonly 
used in graph theory as this makes the subsequent 
discussion more concise. 

A chemical reaction network (CRN) is repre- 
sented a directed multi- hypergraph G{V^E) consist- 
ing of a vertex set V, the compounds, and a set E 
of directed hyper-edges encoding the reactions [2]. 
Each reaction e G ^ is a pair (e~,e+) of multi- 
sets e~,e+ C ]/ of compounds, denoting the educts 
and products of the reaction e. The stoichiometric 
coefficients g- and 8^^^^+ are represented by the 
multiplicity of the compounds in the multisets. For 
instance, the hyperedge encoding 



C2H2 + 2H2O ^ {CH20H)2 



reads 



({C2H2, H2O, H2O}, {{CH20H)2}) 

Reversible reactions are encoded by a pair of forward 
and backward reaction. The entries of the stoichio- 
metric matrix are recovered as S^^^e = s^.^^-^ — s^^^- . 

In addition to the ordinary reactions like the one 
above, CRNs also contain pseudo-reactions repre- 
senting inffux and outffux of compounds of the form 

einix) = ({^m},{^}) and eout{x) = ({^}, {^ont}) 
where Xin and Xout refer to external reservoirs. 
These are additional vertices distinct from V. 
These pseudoreactions feed the CRN and remove 
"waste products" and extract a desired output. In 
particular, the x^^, yout ^ V do not take part in any 
other reaction. 

A ffow on the directed hypergraph G is a func- 
tion f : E U E^ ^ No such that, for each compound 
X eV, the condition 
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is satisfied. This condition enforce that the total pro- 
duction and the total consumption of x is balanced, 
i.e., the CRN is in a stationary state. The total con- 
sumption of an input material x is therefore 



f {^in{x) ) 
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f{e){Sx,e 
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and the total outflux of a product is 
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(2) 



(3) 



We say that a species x is produced in a network if 

f{eout{x)) > 0- 

Note that this definition of / naturally gener- 
alized the definition of an (integer) fiow on a di- 
rected graph with source Xin and target yout^ see 
e.g. ^23j. In [26|, a generalization of equ.(T]), al- 
though restricted to hypergraphs with |e+ =1, 
is considered, where the fiows add up to a vertex- 
dependent demand term rather than to zero. In con- 
trast to the usual setting of fiow problems, we have 
a non-trivial restriction on the capacity only for the 
input edge(s), while the values of / are unrestricted 
for all other hyperedges. 



Formulation of the problems 

MAX-CRN-Output Given a chemical reaction net- 
work with n nodes, of which any subset may have 
infiux or outfiux, find a fiow / that maximizes the 
outfiow f{eout{y)) to a specified output node yout- 
MAX-CRN ((i)-Output Given a chemical reaction 
network with n nodes reactions (hyperedges) with 
in-degree and out-degree at most where any sub- 
set of vertices may have infiux or outfiux, find a fiow 
/ that maximizes the outfiow f{eout{y)) to a specified 
output node yout- 

MAX-CRN ((i)-Output-l Given a chemical reaction 
network with n nodes, reactions (hyperedges) with 
in-degree and out-degree at most d, and a single ver- 
tex with infiux where any subset of vertices may have 
outfiux, find a fiow / that maximizes the outfiow 
f{eoutiy)) to a specified output node yout- 
Autocata Given a chemical reaction network with 
n nodes and one or more input sources, determine 
whether there is a source node x such that: 

1. X cannot be produced from all other source 
molecules, i.e., for all fiows /, f{ein{x)) = 
implies f{eout{x)) = 0; and 



2. X can be produced in a quantity that is larger 
than its infiow, i.e., there is a fiow / such that 

fi^outix)) > fi^inix)) > 0- 



Outline 

Formally, NP-completeness is defined for decision 
problems [?]. Optimization problems can be con- 
verted into decision problems by asking whether they 
admit a solution that is at least as good as some 
value. By abuse of language, it therefore makes sense 
to speak of an "NP-complete optimization problem" 
instead of using the phrase "the decision problem 
corresponding to our optimization problem is NP- 
complete" . 

The basic idea of proving that problem X is NP- 
complete is to find a so-called reduction p from an- 
other problem ^ that is already known to be NP- 
complete. The reduction p is an algorithm with poly- 
nomial runtime that converts any given instance of 
^ into an instance of X. An efficient (i.e., polynomial 
time) algorithm to solve (all instances of) X, there- 
fore would also provide an efficient solution for every 
instance P G ^ by simply reducing P to p{P) G X 
then solving p{P). Hence we can conclude that X is 
a hard problem when a known hard problem ^ can 
be reduced to it. 

In this section we devise a procedure that reduces 
every instance of the so-called 3-partition problem to 
a CRN with a single output pseudo-reaction in such 
a way that solving the output maximization prob- 
lem for the CRN also solves the 3-partition problem. 
Thus optimizing output in CRNs is at least as hard 
as solving 3-partition. The same basic construction 
is then modified to show that the CRN can be built 
in such a way that all reactions are monomolecular 
or bimolecular. We then employ the same construc- 
tion to show that problem remains hard even if only 
a single source is provided. A simple modification 
finally establishes the hardness result for finding au- 
tocatalytic compounds. 



3- Partition 

The 3-partition problem (3 PART) consists in decid- 
ing whether a given multiset of n = 3m integers 5^, 
i = 1, . . . , 3m can be partitioned into triples that 
all have the same sum. This problem is one of the 
most famous strongly NP-complete problems, i.e., it 
stays NP-complete even when the numbers in the 
input instance are given in unary encoding ^27^, i.e.. 
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Figure 2: Construction of a CRN from a given instance of 3 PART. (A) In the first step, an intermediate 
network consisting of input nodes, switch nodes (green diamonds), and waste nodes (open circles), and a 
single output sink (hexagon) is constructed. The input is encoded as capacity constraint on the l.h.s. input 
nodes (corresponding to the input numbers Si of 3PART and on the m top nodes (corresponding to 1/m 
of the sum of the inputs). A solution of 3 PART corresponds to a flow through this network that transport 
• Si to the output sink. (B) In the second step, each switch node is replaced by reaction network that which 
admits a non-zero flow only if Si copies of Qi and Zj are available. The reaction then produces Si copies of 
the output molecule O. Note that the "drainage reactions" as not shown in panel (B). These channel the 
Qj and Zj input material directly to the "waste material" sink whenever the reaction networks inside the 
switch node receives insufficient input to produce both Wi and Vij. 



their values grows not faster than a polynomial in 
the problem size n. This remains true when the Si 
are distinct 28 . If B denotes the desired sum of each 



subset then 3 PART remains strongly NP-complete 
even if for every integer B/A < Si < B/2 holds. 



Basic Construction 

Given an instance of 3 PART we construct the asso- 
ciated CRN in a step-wise fashion. The first step is 
a lattice-like labeled graph. Figure |2];A), that con- 
sists of one input node corresponding to each 5^, m 
auxiliary nodes Zj, each of which has an influx of 
(l/m)^^5^ = s/m, an output sink node, 3m x m 
switch nodes, 3m waste nodes at the right and m 
waste nodes at the bottom. These switch nodes have 
two inputs / from the left and u from above, and 
three outputs r towards the right, d downwards, and 
o into the output channel. Each of the switch nodes 



can be in one of two distinct states: either it 

off The node transmits all its left input to right 
and all its input from above downwards, no 
flow is then diverted towards the output, i.e., 
r = I, d = o = 0; OT 

on The node consumes its entire input from the 
left (and thus transmits nothing to the right), 
at the same time uses up a corresponding 
amount of the input from above, and diverts 
a corresponding amount towards the output, 
i.e., r = 0, d = u — I, o = I. 

All flux along the output channel is collected in the 
output node, i.e., given a particular state of the 
switch nodes, the flux into the output node is the 
sum of the fluxes consumed from the left. 

Lemma 1. An assignment of ''on" and ''off" to the 
3m X m switch nodes is a solution of the original 
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3PART problem if and only if the total flow in the 
output node O equals the maximally possible value 

5 = Ei Si- 
Proof Consider the CRN in Figure [2] with 3m x m 
switch nodes. Each column corresponds to one of 
the m desired subsets of the underlying instance of 
3 PART, each row corresponds to one the 3m integer 
values Sk. Note that any assignment of "on" and 
"off" to switch nodes will split the overall horizontal 
as well as the overall vertical inflow into two parts: 
a part directed to waste material and an output part 
directed to node O. Let wh (resp. wy) be the overall 
horizontally (resp. vertically) produced waste. For 
any assignment of "on" and "off" states to switch 
nodes s = f{eout{0)) ^ wh = f{eoutiO)) + is 
invariant. Obviously, if wh = wy = 0, then the out- 
flow /( 

^out{0)) to node O is maximal. Furthermore 
note that at most one switch can be in "on" state in 
each row. 

Consider an assignment of "on" and "off" to the 
switch nodes that corresponds to a solution of the 
original 3 PART problem. Thus exactly 3m switch 
nodes are in mode "on" (three per column and one 
per row). As one switch node per row i is in mode 
"on" , the outflux Si of node Qi flows to output node 
O and the waste produced horizontally in row i is 
0. As this is true for all rows, wh = wy = holds 
and the total flow in the output node O is s which 
is maximal. 

Assume that the flow in the output node is the 
maximal possible value 5, and therefore wh = wy = 
holds. This implies that exactly one switch node 
per row needs to be in mode "on" . As we can assume 
5/ (4m) < Si < s/(2m) exactly 3 switch nodes per 
column need to be in state "on" . The overall assign- 
ment is therefore a solution to the original 3 PART 
problem. □ 

Of course, the intermediate network in Fig- 
ure |2| A) is not (yet) an proper CRN. To achieve this 
goal, we have to replace the switch nodes by hyper- 
graphs that implement the high-level rule governing 
their behavior. 



Implementing switch-nodes 

Suppose the molecules emitted from the 3m input 
nodes are all of different types Q^, and distinguish 
the m types of inputs from above as Zj. Then the 
switch node {i^j) must implement a net reaction of 



the form 

SiQi + SiZj SiO (4) 

where O is the type of the output molecule. This net 
reaction can be split into four subsequent reactions: 

SiQi ^ Wi 
s 7 Z ^ y Vj ^ 

Xij SiO 

We see that the switch node (i, j) can be in the "on"- 
state only if it received at least Si copies of the in- 
put from the left and a matching number of input 
molecules from above. A graphical description of 
this partial network is shown in Figure |2|B). Since 
the input from the left is limited to Si copies of Qi, 
either none or a single molecule of the intermediate 
Xij is produced, depending on whether {i^j) is on 
or not. Clearly, for each z, only a single one of the 
switches (i, j) can be "on". 

Note that equ.([5| already provides the neces- 
sary device to complete the proof. If we insist that 
the CRN may use at most bi- molecular reactions, 
we have to find a way to implement the reactions 
SiQi Wi and Xij SiO by more restricted el- 
ementary reactions. This will the topic of the fol- 
lowing section. According to equ.([5| each diamond 
node is replaced by 3{si + 1) vertices, so that the en- 
tire network has 6m + 2m + 1 + m Yl^=i 3(s^ + 1) = 
8m + 35m + 3m^ + 1 nodes. Thus, all instances of 
3 PART for which s = s{m) is polynomially bounded 
in m can be reduced to a maximum output problem 
on an equivalent CRN. We explicitly use the fact 
that 3 PART is strongly NP-complete: we need that 
m is polynomially bounded by the network size n to 
ensure that 5, and thus the reduction to 3PART, re- 
mains polynomial. We know the maximal outflux of 
the CRN and can therefore use a simple guess-and- 
check argument to show that MAX-CRN-Output is 
in NP. Our discussion thus establishes 

Theorem 1. MAX-CRN-Output is strongly NP- 
complete when the number of inputs into the CRN 
and number of educts in a chemical reaction is un- 
restricted. 

We remark the our CRNs need to have at least 
two output nodes, one for the desired product and 
one to collect all waste products. 
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Restriction to bi-molecular reactions 

In this section we show that the problem does not 
become easier when the CRN has only a single input 
and all reactions are bi-molecular. To this end we 
further refine the reactions SiQi Wi^ Xij SiO. 
We will make use of two specialized types of edges 
that can be implemented by bi-molecular reactions. 

The first type of edge merges exactly k identical 
molecules into 1 molecule (the corresponding edges 
will be referred to as merge-edges) . The second type 
of edge expands one molecule to exactly k identi- 
cal molecules (expansion-edges). We first focus on a 
specific type of merge- and expansion-edges: merge- 
edges of type (2^ 1) can easily be implemented 
by u subsequent reactions f\i = 1, . . . , that itera- 
tively create (double-sized) molecules out of 2 iden- 
tical molecules. Formally, let I = Xi and O = Xu-\-i 
then is defined by 

2Xi^Xi+i, (6) 

and the corresponding fiow is chosen to be 
/*({X^, X^+i}) := 2^~*. Symmetrically, expansion- 
edges of type (1 2^) can be implemented by 
u subsequent reactions that split molecules repeat- 
edly into two equal molecules. These (2^ 1)- 
merge-edges (resp. (1 — > 2^)-expansion-edges) will 
in the following be used to implement the general- 
ized merge- and expansion-edges. 

Let bm-ibm-2 • ' -bo be the binary representa- 
tion of k > with m = [log/cj + 1, and let 
B = {ii, ^2, • • • , v} be the indices of all non-zero 
bits, i.e i e B with bi = 1. The underlying idea for 
the merging of k molecules of type / into 1 molecule 
of type O is to split the outfiow k of I into r indi- 
vidual fiows, i.e. k = X]j=i2*^~^. We remark that 
this representation is unique. These fiows of quan- 
tity 2*^~^,j = l...r are then individually reduced 
to fiows of size 1. The resulting r fiows of quan- 
tity 1 are then all merged to a fiow of one molecule 
of quantity 1. The implementation of generalized 
merge-edges is depicted in Figure [sj A). Expansion- 
edges that expand the fiow of one molecule of quan- 
tity 1 to a fiow of one molecule of quantity k can 
be implemented analogously. First, a fiow of quan- 
tity 1 of one molecule is changed into r fiows of 
quantity 1, then these r fiows are expanded to r 
fiows of quantity 2*^~^,j = 1, . . .r, and then these 
fiows are iteratively summed up. The details are de- 
picted in Figure [sJB). Clearly, merge and expansion 
edges can be employed for the refinement of reactions 



SiQi Wi^ Xij SiO in equ.([5|. The number of 
additional edges and nodes to implement a (/c ^ 1) 
merge-edge is 0(log^/c), as there are 0{logk) fiows 
after the split into individual fiows, and each indi- 
vidual fiow employs 0{logk) edges for the {k 1) 
merge (with k being a power of 2). Symmetrically a 
{1 ^ k) expansion-edge uses 0(log^ k) bi-molecular 
edges and additional compounds. Based on this 
polynomial extension and as all merge and expansion 
reactions are bi-molecular, we have the following 

Corollary 1. MAX-CRN(2)-0utput is strongly 
NP- complete. 



Restriction to a single input 

To show that MAX-CRN-Output is NP-complete 
even if we have a single input only, we require an ad- 
ditional edge type that is implemented by connecting 
a (/c ^ l)-merge-edge and a (1 ^ /c)-expansion edge 
in series. Such an edge ensures that exactly k (or ex- 
actly a multiplicity of k) input molecules react to the 
same number of output molecules. We will refer to 
these edges as (/c)-force-fiow-edges. Note, that such 
edges do not change the quantity of a fiow. The 
number of additional edges and nodes required to 
implement a (/c)- force- fiow edge is 0(log^ k). 

So far we assumed input nodes Qi with corre- 
sponding infiux i = 1 . . . , 3m, plus the m ad- 
ditional input nodes Zi , . . . , Zm with infiux s — 
(1/m) Si each. In the following we will describe 
how to extend the construction of the CRN based 
on an instance of the 3 PART problem {cmp. Figure 
[2| such that there is only a single input node. Note 
that all Si, m, and the infiux to nodes Zi are defined 
by the given 3 PART instance. 

Influx to nodes Qii In the extended CRN the 
nodes Qi will be internal nodes with infiux Si. In or- 
der to achieve this we will add a single input node Q 
with infiux s' , where s' is the integer representation 
of the concatenation of the r-bit binary representa- 
tion of all Si, i.e., 

3m 

s' = ^SiX2'^^'-^\ with r = max{ [log 5iJ}+l (7) 

Attached to node Q will be a subnetwork that splits 
the fiux s' into the fiuxes si, . . . , ssm by iteratively 
using the last r bit of the remaining fiux as infiux to 
a node Qi, and then divide the remaining fiux by 2^. 
The hypergraph structure to implement this with 
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Figure 3: Consider the binary representation bm-ibm-2 - - - bo of k > with m = [log/cj + 1. Let 
B = {ii, ^2, • • • , v} be the indices of ah non-zero bits, i.e., i G B with bi = 1. (A) Implementation of a 
{k 1) merge-edge. (B) Implementation of a (1 ^ k) expansion-edge. The red edges indicate (2* 1) 
merges and (1 2*) expansions, respectively. 



bi-molecular reactions only is depicted in Figure [4j 
All dashed lines with red rectangles indicate force- 
fiow-edges (the number in the rectangle indicates the 
enforced flow), all red edges with open arrowheads 
indicate merge- or expansion- edges. To enforce that 
exactly (and not a multiplicity) of Si molecules flow 
towards node Q^, the flow downwards needs to be 
maximized. This is done by introducing an addi- 
tional outflux node: the flux of quantity > 1 
towards O' is multiplied by a factor c, such that the 
additional overall non- waste outflux to dominates 
any other non-waste outflux. This can be ensured by 
choosing the factor c as the maximal possible influx 
to Q, i.e. c = 2^^^^ — 1 (the binary representation 
of c has r x 3m bit all set to 1). The number of 
additional edges and nodes is polynomially bound 
and the overall outflux of the extended network is 
then 53^ X c + Si. As all outflux can be easily 
merged in a binary fashion as applied in the defini- 
tion of expansion-edges, the resulting CRN has only 
a single input node and a single non-waste output 
node. 

Influx to nodes Zii In order to have nodes Zi 
{cmp. Figure |2| as internal nodes, we split the out- 
flux from node Q of quantity in two fluxes of quan- 
tity 5^ — 1 and 1 (by employing force-flow-edges), that 
will be directly merged again and be used as influx 
of quantity s' to node Q' . However, this simple split- 
ting procedure gives a flux of quantity 1. This simple 
flux is easily transformed into m fluxes of quantity 1, 
which are then multiplied hy s/m using expansion- 
edges, and then used as the input towards the inter- 
nal nodes Zi. 

Recall, that the number of nodes and edges 



needed for a force- flow- edge of quantity k is 
0(log^ k). The number of bits for the maximal flux 
on any force-flow-edge is 0{r x 3m). As 3PART 
is strongly NP-complete we can assume that all Si 
are polynomially bound in m, and therefore r G 
O(logm). Therefore the maximal flux on any edge 
is (9(2^ ^os^). The number of additional nodes and 
edges is therefore O(m^log^m) per force-flow-edge. 
As the construction needs 0(m) additional force- 
flow-edges, the overall number of additional nodes 
and edges is O(m^log^m). Therefore the following 
corollary easily follows: 

Corollary 2. MAX-CRN (2)-0utput-l is NP- 
complete. 



Autocatalysis 

The NP-completeness of detecting an autocatalytic 
species can be shown by expanding the CRN used 
for showing the NP-completeness of MAX-CRN (2)- 
Output-1. Let O be the output node, where a out- 
flux of X c + ^ • Si can be detected iff the un- 
derlying instance of 3 PART is solved. We add a 
merge-edge from O towards an additional node A' 
to create an outflux of exactly 1 from A\ The CRN 
is furthermore extended by the following two addi- 
tional reactions, where compound A is an input and 
an output node of the CRN. 

A' ^A 2B 
B ^ A 

The outflux of A' is 1, if and only if 
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Figure 4: Splitting the single 
influx s' to node Q' such that 
the influxes to the internal nodes 
Qi are sf. the influx to node 
Q is chosen to have the quan- 
tity = Yfi=i Si X 2^(^-1) with 
r = max{ [log Si J } + 1, i.e., s' 
is determined by the concatena- 
tion of binary representation of 
the values s^; force-flow edges 
are depicted as dashed lines la- 
beled with the enforced quan- 
tity, merge- (resp. expansion-) 
edges are depicted as red lines 
with open arrowheads labeled 
the quantiflcation of merging 
(resp., expansion); the constant 
c for the expansion towards node 
O is chosen such that the outflux 
in node O dominates the outflux 
of the original lattice CRN. 



1. Compound A cannot be produced from all 
other source molecules, i.e., for all flows 
/(em(A)) = implies f{eoutiA)) = 0, and 

2. two A can be produced if their is an inflow 
of one A, i.e., there is a flow / such that 
fieoutiA)) > fieiniA)) > 0. 

The construction of our reduction highlights the dif- 
flcult part in determining autocatalysts. This is not 
so much flnding the autocatalytic cycle itself but to 
ensure that the building blocks are provided from 



the "food source" through an in principle arbitrarily 
complicated sub-network. 



Concluding Remarks 

We have shown that the flow maximization prob- 
lem and the detection of autocatalytic species in 
chemical reaction networks are NP-complete com- 
putational problems. As a consequence , we cannot 
expect to flnd devise exact algorithms for these prob- 
lems that can be used efficiently on large chemical 
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reaction networks (unless P=NP, which is unhkely at 
best [29*). Our results match well with the observa- 
tion that many classical computational problems are 
hard on hypergraphs even though their analogs for 
simple graphs admit efficient exact solutions. Illus- 
trative examples are matching problems 30 , or the 



sparsest null space problem for integer matrices 31 



which can be seen as the natural generalization of 
the minimum cycle basis problem. As graph models 
of chemical networks tend to be oversimplifications 
that are often of limited use 1 , the hardness of the 
computational task associated with the analysis of 
large reaction networks cannot be avoided. As exact 
algorithms appear out of reach, it will be necessary 
to systematically explore efficient approximation al- 
gorithms and heuristics for the combinatorial prob- 
lems naturally arising from Systems Chemistry. 
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